Reverberant Speech Recognition Combining Deep Neural Networks and Deep Autoencoders

نویسندگان

Masato Mimura

Shinsuke Sakai

Tatsuya Kawahara

چکیده

We propose an approach to reverberant speech recognition adopting deep learning in front end as well as back end of the system. At the front end, we adopt a deep autoencoder for enhancing the speech feature parameters, and the recognition is performed using a DNN-HMM acoustic models trained on multi-condition data. The system was evaluated through the ASR task in Chime Challenge 2014. The DNN-HMM system trained on the multi-condition training set achieved a conspicuously higher word accuracy compared to the MLLR-adapted GMM-HMM system trained on the same data. Furthermore, feature enhancement with the deep autoencoder contributed to the improvement of recognition accuracy especially in the more adverse conditions. When the DNN-HMM was used without the deep autoencoder front end, it resulted in a better performance than the non-adapted GMM-HMM system, but was not as good as the adapted GMM-HMM system. However, it outperformed the adapted GMM-HMM system when combined with the deep autoencoder.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

We propose an approach to reverberant speech recognition adopting deep learning in the front-end as well as back-end of a reverberant speech recognition system, and a novel method to improve the dereverberation performance of the front-end network using phone-class information. At the front-end, we adopt a deep autoencoder (DAE) for enhancing the speech feature parameters, and speech recognitio...

متن کامل

Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Advances in Deep Learning

Deep neural networks have become increasingly more popular under the name of deep learning recently due to their success in challenging machine learning tasks. Although the popularity is mainly due to the recent successes, the history of neural networks goes as far back as 1958 when Rosenblatt presented a perceptron learning algorithm. Since then, various kinds of artificial neural networks hav...

متن کامل

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2014

Reverberant Speech Recognition Combining Deep Neural Networks and Deep Autoencoders

نویسندگان

چکیده

منابع مشابه

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature

Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Advances in Deep Learning

Speech Emotion Recognition Using Scalogram Based Deep Structure

عنوان ژورنال:

اشتراک گذاری